Skip to content

Conversation

@Li-shi-ling
Copy link
Contributor

@Li-shi-ling Li-shi-ling commented Jan 10, 2026

标题

fix: ensure atomic creation of knowledge base with proper cleanup on failure

描述

Motivation / 动机

修复了创建知识库时的事务一致性问题。Issue #4403:创建新的知识库实例时,如果没有提供embedding_provider_id,会导致创建了kb数据库记录但是没有创建kb_helper。当下次创建同一个名称时出现SQLite唯一约束冲突,但实际上这个数据库并没有被完整创建。

Modifications / 改动点

  • 修改了 astrbot/core/knowledge_base/kb_mgr.py 中的 create_kb 方法
  • 实现了参数预校验,确保创建知识库前提供必要的 embedding_provider_id
  • 添加了同名知识库检查,避免重复创建
  • 改进了异常处理机制,当KBHelper初始化失败时,使用同一个数据库会话回滚已创建的记录
  • 确保数据一致性,避免留下不完整的数据库记录

修改后的关键代码逻辑:

# 1. 参数预校验
if embedding_provider_id is None:
    raise ValueError("创建知识库时必须提供embedding_provider_id")

# 2. 同名检查
existing_kb = await self.kb_db.get_kb_by_name(kb_name)
if existing_kb is not None:
    raise ValueError(f"知识库名称 '{kb_name}' 已存在")

# 3. 创建数据库记录后,如果KBHelper初始化失败,回滚
try:
    kb_helper = KBHelper(...)
    await kb_helper.initialize()
except Exception:
    # 使用同一个会话来删除,确保事务一致性
    await session.refresh(kb)
    await session.delete(kb)
    await session.commit()
    raise

验证步骤:

在1号示例,我展示了在触发KBHelper的initialize创建时出现报错,这个时候代码正常回滚删除没有创建完成的数据库,没有影响我的下一次同名创建
完整三号示例

在2号示例,我先使用了没有embedding_provider_id参数的创建请求,正常抛出错误
然后使用了有embedding_provider_id参数的创建请求,成功创建
然后在下一次同名创建,正常抛出错误
4号示例

我的测试插件代码(部分)

    @memorychain.command("kbep")
    async def get_ep(self, event: AstrMessageEvent):
        """获取ep列表"""
        ep_names = []
        p_ids = list(self.context.provider_manager.inst_map.keys())
        for p_id in p_ids:
            providers = self.context.get_provider_by_id(p_id)
            if isinstance(providers, EmbeddingProvider):
                ep_names.append(p_id)
        yield event.plain_result(f"能够使用的编码器列表:\n" + "\n".join(ep_names))
        logger.info(f"[memorychain] 能够使用的编码器列表:\n" + "\n".join(ep_names))


    @memorychain.command("kbcr")
    async def kb_create(self, event: AstrMessageEvent, kb_name:str, ep_names:str):
        """创建数据库"""
        await self.context.kb_manager.create_kb(
            kb_name = kb_name,
            embedding_provider_id = ep_names
        )
        yield event.plain_result(f"成功创建数据库:{kb_name}")
        logger.info(f"[memorychain] 成功创建数据库:{kb_name}")


    @memorychain.command("kbcr_cs")
    async def kb_create_cs(self, event: AstrMessageEvent, kb_name:str):
        """创建数据库"""
        await self.context.kb_manager.create_kb(
            kb_name = kb_name
        )
        yield event.plain_result(f"成功创建数据库:{kb_name}")
        logger.info(f"[memorychain] 成功创建数据库:{kb_name}")

Checklist / 检查清单

  • 😊 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。/ If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    • 没有加入新功能,只在原本处理逻辑上面加了更完善的处理
  • 👀 我的更改经过了良好的测试,并已在上方提供了"验证步骤"和"运行截图"。/ My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    • 已在本地环境测试了各种边界情况
    • 提供了详细的验证步骤
  • 🤓 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到了 requirements.txtpyproject.toml 文件相应位置。/ I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    • 本次修改仅修改现有逻辑,未引入新的依赖库
  • 😮 我的更改没有引入恶意代码。/ My changes do not introduce malicious code.
    • 代码修改集中在异常处理和事务一致性,不包含任何恶意代码

额外说明

  • 修复方法遵循了Issue评论中dosubot的建议方案
  • 使用了项目中已有的 get_kb_by_name() 方法,保持代码一致性
  • 修改后的代码更加健壮,能够处理各种异常情况

Summary by Sourcery

通过验证输入、防止名称重复以及清理创建失败的记录,确保知识库实例创建的原子性和一致性。

Bug 修复:

  • 当 KB helper 初始化失败时,防止不完整的知识库记录残留在数据库中。
  • 通过要求必须提供 embedding provider,并在创建前拒绝重复的知识库名称,避免触发 SQLite 唯一约束错误。

增强内容:

  • 增加对必需参数的预验证,并在知识库创建无法完成时提供明确的错误报告。
Original summary in English

Summary by Sourcery

Ensure atomic and consistent creation of knowledge base instances by validating inputs, preventing duplicate names, and cleaning up failed creations.

Bug Fixes:

  • Prevent incomplete knowledge base records being left in the database when KB helper initialization fails.
  • Avoid SQLite unique constraint violations by requiring an embedding provider and rejecting duplicate knowledge base names before creation.

Enhancements:

  • Add pre-validation of required parameters and explicit error reporting when knowledge base creation cannot be completed.

…failure

- Added pre-validation for embedding_provider_id parameter
- Added check for existing knowledge base with same name
- Implemented proper rollback mechanism when KBHelper initialization fails
- Uses same session for cleanup to ensure data consistency
- Fixes AstrBotDevs#4403
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. feature:knowledge-base The bug / feature is about knowledge base labels Jan 10, 2026
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - 我发现了 3 个问题,并且给出了一些整体性的反馈:

  • 在插入前对 existing_kb 做预检查会引入竞态窗口;建议依赖数据库层面对 kb_name 的唯一约束,并在违反唯一约束时捕获数据库的完整性错误,再将其映射为对用户友好的异常,而不是先单独查一次是否存在。
  • 为了让整个操作保持完全原子性,可以考虑在 KBHelper 初始化之前避免提交:通过使用 session.flush() 获取 kb_id,只在 kb_helper.initialize() 成功之后再执行提交,而不是先提交、然后在失败时再删除并再次提交。
  • 在异常处理代码块中,await session.refresh(kb)delete(kb) 之前调用,在刚提交之后通常没有必要;你可以考虑移除这次 refresh 来简化清理逻辑。
面向 AI Agent 的提示词
Please address the comments from this code review:

## Overall Comments
- The pre-check for existing_kb before inserting introduces a race window; consider relying on a DB-level unique constraint for kb_name and mapping the integrity error to a user-facing exception instead of doing a separate existence query.
- To keep the operation fully atomic, you might avoid committing before KBHelper initialization by using `session.flush()` to obtain the kb_id and committing only after `kb_helper.initialize()` succeeds, rather than committing and then deleting/committing again on failure.
- Inside the exception handler, `await session.refresh(kb)` before `delete(kb)` seems unnecessary immediately after a commit; you can likely remove the refresh to simplify the cleanup logic.

## Individual Comments

### Comment 1
<location> `astrbot/core/knowledge_base/kb_mgr.py:100-101` </location>
<code_context>
+            raise ValueError("创建知识库时必须提供embedding_provider_id")
+
+        # 检查是否已存在同名知识库
+        existing_kb = await self.kb_db.get_kb_by_name(kb_name)
+        if existing_kb is not None:
+            raise ValueError(f"知识库名称 '{kb_name}' 已存在")
+
</code_context>

<issue_to_address>
**issue (bug_risk):** The name uniqueness check is vulnerable to race conditions under concurrent create calls.

This pre-insert check won’t prevent duplicates under concurrent creates: two requests can both see no existing KB and then both insert. If the DB enforces a unique index on `kb_name`, consider catching the resulting integrity error and mapping it to this `ValueError` so behavior remains correct and consistent under contention.
</issue_to_address>

### Comment 2
<location> `astrbot/core/knowledge_base/kb_mgr.py:117-126` </location>
<code_context>
+        kb_helper = None
</code_context>

<issue_to_address>
**suggestion:** The `kb_helper` sentinel and final RuntimeError appear to be unreachable and can be simplified away.

Because any construction error is re-raised in the `try/except` and successful initialization always assigns a non-None `kb_helper`, the `kb_helper = None` sentinel and the final `RuntimeError("知识库创建失败:未知错误")` are dead code. You can construct `kb_helper` inside the `try`, return it after the `with` block, and remove the extra check and fallback error.

Suggested implementation:

```python
        kb = KnowledgeBase(
            kb_name=kb_name,
            description=description,
            top_k_sparse=top_k_sparse if top_k_sparse is not None else 50,
            top_m_final=top_m_final if top_m_final is not None else 5,
        )

        async with self.kb_db.get_db() as session:
            session.add(kb)
            await session.commit()
            await session.refresh(kb)

```

Based on your comment, there is likely additional code in this function or method that:

1. Uses `kb_helper` after this block, e.g., something like:
   - A `try/except` that wraps the construction of `kb_helper`.
   - A `if kb_helper is None: raise RuntimeError("知识库创建失败:未知错误")` guard.

To fully apply your suggestion, you should also:

1. Move the construction of `kb_helper` inside the `try` block (or inside this `async with` block if appropriate for your logic), and ensure it always gets assigned on success.
2. Rely on the `try/except` to re-raise or translate any construction errors, so there is no need for a `None` sentinel.
3. Remove any `if kb_helper is None:` checks and the fallback `RuntimeError("知识库创建失败:未知错误")`, since they will be unreachable once `kb_helper` is always either:
   - Successfully created and returned, or
   - Throws an exception caught by your `except` block and re-raised.

You will need to apply these changes where the rest of the `kb_helper` logic is defined, which is not visible in the provided snippet.
</issue_to_address>

### Comment 3
<location> `astrbot/core/knowledge_base/kb_mgr.py:131-135` </location>
<code_context>
+                    chunker=CHUNKER,
+                )
+                await kb_helper.initialize()
+            except Exception:
+                await session.refresh(kb)
+                await session.delete(kb)
+                await session.commit()
+                raise
+        # 判断是否成功创建
</code_context>

<issue_to_address>
**suggestion (performance):** The extra `session.refresh(kb)` in the exception handler seems unnecessary and could be dropped.

Because `kb` is immediately deleted in the same session, the refresh doesn’t change the behavior and only adds an extra DB round-trip. Unless you specifically need the latest state before deleting, this call can be removed.

```suggestion
            except Exception:
                await session.delete(kb)
                await session.commit()
                raise
```
</issue_to_address>

Sourcery 对开源项目是免费的——如果你觉得这次 Review 有帮助,欢迎分享 ✨
帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据你的反馈改进后续的 Review。
Original comment in English

Hey - I've found 3 issues, and left some high level feedback:

  • The pre-check for existing_kb before inserting introduces a race window; consider relying on a DB-level unique constraint for kb_name and mapping the integrity error to a user-facing exception instead of doing a separate existence query.
  • To keep the operation fully atomic, you might avoid committing before KBHelper initialization by using session.flush() to obtain the kb_id and committing only after kb_helper.initialize() succeeds, rather than committing and then deleting/committing again on failure.
  • Inside the exception handler, await session.refresh(kb) before delete(kb) seems unnecessary immediately after a commit; you can likely remove the refresh to simplify the cleanup logic.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The pre-check for existing_kb before inserting introduces a race window; consider relying on a DB-level unique constraint for kb_name and mapping the integrity error to a user-facing exception instead of doing a separate existence query.
- To keep the operation fully atomic, you might avoid committing before KBHelper initialization by using `session.flush()` to obtain the kb_id and committing only after `kb_helper.initialize()` succeeds, rather than committing and then deleting/committing again on failure.
- Inside the exception handler, `await session.refresh(kb)` before `delete(kb)` seems unnecessary immediately after a commit; you can likely remove the refresh to simplify the cleanup logic.

## Individual Comments

### Comment 1
<location> `astrbot/core/knowledge_base/kb_mgr.py:100-101` </location>
<code_context>
+            raise ValueError("创建知识库时必须提供embedding_provider_id")
+
+        # 检查是否已存在同名知识库
+        existing_kb = await self.kb_db.get_kb_by_name(kb_name)
+        if existing_kb is not None:
+            raise ValueError(f"知识库名称 '{kb_name}' 已存在")
+
</code_context>

<issue_to_address>
**issue (bug_risk):** The name uniqueness check is vulnerable to race conditions under concurrent create calls.

This pre-insert check won’t prevent duplicates under concurrent creates: two requests can both see no existing KB and then both insert. If the DB enforces a unique index on `kb_name`, consider catching the resulting integrity error and mapping it to this `ValueError` so behavior remains correct and consistent under contention.
</issue_to_address>

### Comment 2
<location> `astrbot/core/knowledge_base/kb_mgr.py:117-126` </location>
<code_context>
+        kb_helper = None
</code_context>

<issue_to_address>
**suggestion:** The `kb_helper` sentinel and final RuntimeError appear to be unreachable and can be simplified away.

Because any construction error is re-raised in the `try/except` and successful initialization always assigns a non-None `kb_helper`, the `kb_helper = None` sentinel and the final `RuntimeError("知识库创建失败:未知错误")` are dead code. You can construct `kb_helper` inside the `try`, return it after the `with` block, and remove the extra check and fallback error.

Suggested implementation:

```python
        kb = KnowledgeBase(
            kb_name=kb_name,
            description=description,
            top_k_sparse=top_k_sparse if top_k_sparse is not None else 50,
            top_m_final=top_m_final if top_m_final is not None else 5,
        )

        async with self.kb_db.get_db() as session:
            session.add(kb)
            await session.commit()
            await session.refresh(kb)

```

Based on your comment, there is likely additional code in this function or method that:

1. Uses `kb_helper` after this block, e.g., something like:
   - A `try/except` that wraps the construction of `kb_helper`.
   - A `if kb_helper is None: raise RuntimeError("知识库创建失败:未知错误")` guard.

To fully apply your suggestion, you should also:

1. Move the construction of `kb_helper` inside the `try` block (or inside this `async with` block if appropriate for your logic), and ensure it always gets assigned on success.
2. Rely on the `try/except` to re-raise or translate any construction errors, so there is no need for a `None` sentinel.
3. Remove any `if kb_helper is None:` checks and the fallback `RuntimeError("知识库创建失败:未知错误")`, since they will be unreachable once `kb_helper` is always either:
   - Successfully created and returned, or
   - Throws an exception caught by your `except` block and re-raised.

You will need to apply these changes where the rest of the `kb_helper` logic is defined, which is not visible in the provided snippet.
</issue_to_address>

### Comment 3
<location> `astrbot/core/knowledge_base/kb_mgr.py:131-135` </location>
<code_context>
+                    chunker=CHUNKER,
+                )
+                await kb_helper.initialize()
+            except Exception:
+                await session.refresh(kb)
+                await session.delete(kb)
+                await session.commit()
+                raise
+        # 判断是否成功创建
</code_context>

<issue_to_address>
**suggestion (performance):** The extra `session.refresh(kb)` in the exception handler seems unnecessary and could be dropped.

Because `kb` is immediately deleted in the same session, the refresh doesn’t change the behavior and only adds an extra DB round-trip. Unless you specifically need the latest state before deleting, this call can be removed.

```suggestion
            except Exception:
                await session.delete(kb)
                await session.commit()
                raise
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@Li-shi-ling
Copy link
Contributor Author

手顺把注释改了,现在改回去了🏃

@Li-shi-ling
Copy link
Contributor Author

示例五号 这个新x修改后的测试

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jan 11, 2026
@Soulter Soulter merged commit 9a91f2f into AstrBotDevs:master Jan 11, 2026
6 checks passed
@Soulter Soulter changed the title fix: ensure atomic creation of knowledge base with proper cleanup on … fix: ensure atomic creation of knowledge base with proper cleanup on failure Jan 11, 2026
@Li-shi-ling Li-shi-ling deleted the fix/4403-kb-creation-transaction branch January 12, 2026 01:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature:knowledge-base The bug / feature is about knowledge base lgtm This PR has been approved by a maintainer size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants